Sampling Strategies for Finite Population Using Auxiliary Information


Abstract 


This paper deals with the problem of estimating the finite population mean when some
information on two auxiliary attributes is available. A class of estimators is defined which
includes the estimators recently proposed by Malik and Singh (2012), Naik and Gupta (1996)
and Singh et al. (2007) as particular cases. It is shown that the proposed estimator is more
efficient than the usual mean estimator and other existing estimators. The study is also
extended to two-phase sampling. The results are illustrated numerically using empirical
populations considered in the literature.


Keywords: Simple random sampling, two-phase sampling, auxiliary attribute, point bi-serial
correlation, phi correlation, efficiency.


1. Introduction 


There are situations in which, in place of one auxiliary attribute, we have
information on two qualitative variables. For illustration, to estimate hourly wages we can
use information on marital status and region of residence (see Gujrati and Sangeetha
(2007), page 311). Here we assume that both auxiliary attributes have significant point
bi-serial correlation with the study variable and that there is significant phi-correlation
(see Yule (1912)) between the two auxiliary attributes. The use of auxiliary information can
increase the precision of an estimator when the study variable Y is highly correlated with
the auxiliary variable X. In survey sampling, auxiliary variables are often measured on a
ratio scale (e.g. income, output, prices, costs, height and temperature), but sometimes they
are qualitative or nominal, such as sex, race, color, religion, nationality and geographical
region. For example, female workers are found to earn less than their male counterparts, and
non-white workers are found to earn less than white workers (see Gujrati and Sangeetha
(2007), page 304). Naik and Gupta (1996) introduced a ratio estimator for the case in which
the study variable and the auxiliary attribute are positively correlated. Jhajj et al. (2006)
suggested a family of estimators for the population mean in single and two-phase sampling
when the study variable and the auxiliary attribute are positively correlated. Shabbir and
Gupta (2007), Singh et al. (2008), Singh et al. (2010) and Abd-Elfattah et al. (2010) have
considered the problem of estimating the population mean $\bar{Y}$ taking into consideration
the point bi-serial correlation between the auxiliary attribute and the study variable.


2. Some Estimators in Literature


To estimate the population mean of the study variable y, assuming knowledge of the
population proportion P of the auxiliary attribute, Naik and Gupta (1996) and Singh et al.
(2007), respectively, proposed the following estimators:


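$$ t_1 = \bar{y}\left(\frac{P_1}{p_1}\right) \tag{2.1} $$

$$ t_2 = \bar{y}\left(\frac{p_2}{P_2}\right) \tag{2.2} $$

$$ t_3 = \bar{y}\exp\left(\frac{P_1 - p_1}{P_1 + p_1}\right) \tag{2.3} $$

$$ t_4 = \bar{y}\exp\left(\frac{p_2 - P_2}{p_2 + P_2}\right) \tag{2.4} $$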
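As a rough illustration, the four estimators can be computed from a single simple random
sample as in the sketch below. It assumes the forms $t_1 = \bar{y}P_1/p_1$ (ratio),
$t_2 = \bar{y}p_2/P_2$ (product), $t_3 = \bar{y}\exp((P_1 - p_1)/(P_1 + p_1))$ and
$t_4 = \bar{y}\exp((p_2 - P_2)/(p_2 + P_2))$. The function name `attribute_estimators` and the
toy sample are hypothetical and are not taken from the paper.

```python
import math


def attribute_estimators(y, phi1, phi2, P1, P2):
    """Return (t1, t2, t3, t4) for one simple random sample.

    y      : sampled values of the study variable
    phi1   : 0/1 indicators of the first auxiliary attribute in the sample
    phi2   : 0/1 indicators of the second auxiliary attribute in the sample
    P1, P2 : known population proportions of the two attributes
    """
    n = len(y)
    y_bar = sum(y) / n
    p1 = sum(phi1) / n  # sample proportion, attribute 1 (must be > 0 for t1)
    p2 = sum(phi2) / n  # sample proportion, attribute 2
    t1 = y_bar * P1 / p1                            # ratio estimator
    t2 = y_bar * p2 / P2                            # product estimator
    t3 = y_bar * math.exp((P1 - p1) / (P1 + p1))    # exponential ratio-type
    t4 = y_bar * math.exp((p2 - P2) / (p2 + P2))    # exponential product-type
    return t1, t2, t3, t4


# Hypothetical toy sample, for illustration only
y = [10.0, 20.0, 30.0, 40.0]
phi1 = [0, 1, 1, 1]
phi2 = [0, 0, 1, 1]
t1, t2, t3, t4 = attribute_estimators(y, phi1, phi2, P1=0.5, P2=0.5)
```

In this toy sample $p_2 = P_2$, so $t_2$ and $t_4$ reduce to the sample mean, while $t_1$ and
$t_3$ pull the mean downward because $p_1 > P_1$.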


The bias and MSE expressions of the estimators $t_i$ $(i = 1, 2, 3, 4)$ up to the first order
of approximation are, respectively, given by
correlation between $\phi_1$ and $\phi_2$, respectively, corresponding to the population
phi-covariance and phi-correlation

$$ S_{\phi_1\phi_2} = \frac{1}{N-1}\sum_{i=1}^{N}(\phi_{1i} - P_1)(\phi_{2i} - P_2) $$
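The phi-covariance can be computed directly from two 0/1 indicator vectors. The sketch below
assumes the divisor $N - 1$ and, as a labelled assumption, takes the phi-correlation to be the
usual ratio $\rho_\phi = S_{\phi_1\phi_2}/(S_{\phi_1}S_{\phi_2})$. The function name
`phi_statistics` and the data are hypothetical.

```python
import math


def phi_statistics(phi1, phi2):
    """Return (phi-covariance, phi-correlation) of two 0/1 attributes.

    Both use the divisor N - 1; the correlation is assumed to be the
    covariance divided by the product of the two standard deviations.
    """
    N = len(phi1)
    P1 = sum(phi1) / N
    P2 = sum(phi2) / N
    s12 = sum((a - P1) * (b - P2) for a, b in zip(phi1, phi2)) / (N - 1)
    s1 = math.sqrt(sum((a - P1) ** 2 for a in phi1) / (N - 1))
    s2 = math.sqrt(sum((b - P2) ** 2 for b in phi2) / (N - 1))
    return s12, s12 / (s1 * s2)


# Hypothetical population of six units
phi1 = [1, 1, 0, 0, 1, 0]
phi2 = [1, 0, 0, 0, 1, 1]
cov, rho = phi_statistics(phi1, phi2)
```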


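Malik and Singh (2012) proposed estimators $t_5$ and $t_6$ as

$$ t_5 = \bar{y}\left(\frac{P_1}{p_1}\right)^{\alpha_1}\left(\frac{P_2}{p_2}\right)^{\alpha_2} \tag{2.13} $$

$$ t_6 = \bar{y}\exp\left(\frac{P_1 - p_1}{P_1 + p_1}\right)^{\beta_1}\exp\left(\frac{p_2 - P_2}{p_2 + P_2}\right)^{\beta_2} \tag{2.14} $$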


where $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ are real constants.


The bias and MSE expressions of the estimators $t_5$ and $t_6$ up to the first order of
approximation are, respectively, given by


3. The Suggested Class of Estimators


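Using a linear combination of $t_i$ $(i = 0, 1, 2)$, we define an estimator of the form

$$ t_p = \sum_{i=0}^{2} w_it_i \in H \tag{3.1} $$

such that

$$ \sum_{i=0}^{2} w_i = 1 \quad \text{and} \quad w_i \in R, \tag{3.2} $$

where

$$ t_0 = \bar{y}, $$

$$ t_1 = \bar{y}\left[\frac{L_1P_1 + L_2}{L_1p_1 + L_2}\right]^{\alpha_1}\left[\frac{L_3P_2 + L_4}{L_3p_2 + L_4}\right]^{\alpha_2} $$

and

$$ t_2 = \bar{y}\exp\left[\frac{(L_5P_1 + L_6) - (L_5p_1 + L_6)}{(L_5P_1 + L_6) + (L_5p_1 + L_6)}\right]^{\beta_1}\exp\left[\frac{(L_7p_2 + L_8) - (L_7P_2 + L_8)}{(L_7p_2 + L_8) + (L_7P_2 + L_8)}\right]^{\beta_2}, $$

where $w_i$ $(i = 0, 1, 2)$ denotes the constants used for reducing the bias in the class of
estimators, H denotes the set of those estimators that can be constructed from $t_i$
$(i = 0, 1, 2)$ and R denotes the set of real numbers (for detail see Singh et al. (2008)).
Also, $L_i$ $(i = 1, 2, \ldots, 8)$ are either real numbers or functions of the known
parameters of the auxiliary attributes.


Expressing $t_p$ in terms of e's, we have

$$ t_p = \bar{Y}(1 + e_0)\left[w_0 + w_1(1 + \theta_1e_1)^{-\alpha_1}(1 + \theta_2e_2)^{-\alpha_2} + w_2\exp\left(-\theta_3e_1[1 + \theta_3e_1]^{-1}\right)^{\beta_1}\exp\left(\theta_4e_2[1 + \theta_4e_2]^{-1}\right)^{\beta_2}\right] \tag{3.3} $$

where, writing $\bar{y} = \bar{Y}(1 + e_0)$ and $p_j = P_j(1 + e_j)$ $(j = 1, 2)$,

$$ \theta_1 = \frac{L_1P_1}{L_1P_1 + L_2}, \quad \theta_2 = \frac{L_3P_2}{L_3P_2 + L_4}, \quad \theta_3 = \frac{L_5P_1}{2(L_5P_1 + L_6)}, \quad \theta_4 = \frac{L_7P_2}{2(L_7P_2 + L_8)}. $$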
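After expanding, subtracting $\bar{Y}$ from both sides of equation (3.3) and neglecting terms
of power greater than two, we have

$$ (t_p - \bar{Y}) = \bar{Y}\left[e_0 - w_1(\alpha_1\theta_1e_1 + \alpha_2\theta_2e_2) - w_2(\beta_1\theta_3e_1 - \beta_2\theta_4e_2)\right] \tag{3.4} $$

Squaring both sides of (3.4) and then taking expectations, we get the MSE of the estimator
$t_p$ up to the first order of approximation as

$$ MSE(t_p) = \bar{Y}^2f_1\left[C_y^2 + w_1^2T_1 + w_2^2T_2 + 2w_1w_2T_3 - 2w_1T_4 - 2w_2T_5\right] \tag{3.5} $$

where the values of $w_1$ and $w_2$ that minimize (3.5) are

$$ w_1 = \frac{T_2T_4 - T_3T_5}{T_1T_2 - T_3^2}, \qquad w_2 = \frac{T_1T_5 - T_3T_4}{T_1T_2 - T_3^2} \tag{3.6} $$

and

$$ T_1 = \theta_1^2\alpha_1^2C_{p_1}^2 + \theta_2^2\alpha_2^2C_{p_2}^2 + 2\alpha_1\alpha_2\theta_1\theta_2K_\phi C_{p_2}^2 $$

$$ T_2 = \theta_3^2\beta_1^2C_{p_1}^2 + \theta_4^2\beta_2^2C_{p_2}^2 - 2\beta_1\beta_2\theta_3\theta_4K_\phi C_{p_2}^2 $$

$$ T_3 = \alpha_1\beta_1\theta_1\theta_3C_{p_1}^2 - \alpha_2\beta_2\theta_2\theta_4C_{p_2}^2 + \alpha_2\beta_1\theta_2\theta_3K_\phi C_{p_2}^2 - \alpha_1\beta_2\theta_1\theta_4K_\phi C_{p_2}^2 $$

$$ T_4 = \alpha_1\theta_1K_{pb_1}C_{p_1}^2 + \alpha_2\theta_2K_{pb_2}C_{p_2}^2 $$

$$ T_5 = \beta_1\theta_3K_{pb_1}C_{p_1}^2 - \beta_2\theta_4K_{pb_2}C_{p_2}^2 \tag{3.7} $$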
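The minimization behind the optimum weights is a two-variable quadratic problem. The sketch
below assumes the weight-dependent part of the MSE has the form
$w_1^2T_1 + w_2^2T_2 + 2w_1w_2T_3 - 2w_1T_4 - 2w_2T_5$ and solves the resulting normal
equations $T_1w_1 + T_3w_2 = T_4$ and $T_3w_1 + T_2w_2 = T_5$. The function names and the
numerical values of $T_1, \ldots, T_5$ are hypothetical and chosen only for illustration.

```python
def optimum_weights(T1, T2, T3, T4, T5):
    """Return (w0, w1, w2) minimising the quadratic part of the MSE.

    Solves T1*w1 + T3*w2 = T4 and T3*w1 + T2*w2 = T5 by Cramer's rule;
    w0 follows from the constraint w0 + w1 + w2 = 1.
    This is a minimum only if T1 > 0 and T1*T2 - T3**2 > 0.
    """
    det = T1 * T2 - T3 ** 2
    if det == 0:
        raise ValueError("T1*T2 - T3^2 is zero; the minimiser is not unique")
    w1 = (T2 * T4 - T3 * T5) / det
    w2 = (T1 * T5 - T3 * T4) / det
    return 1.0 - w1 - w2, w1, w2


def quadratic_part(w1, w2, T1, T2, T3, T4, T5):
    """Weight-dependent part of the first-order MSE (assumed form)."""
    return (w1 ** 2 * T1 + w2 ** 2 * T2 + 2 * w1 * w2 * T3
            - 2 * w1 * T4 - 2 * w2 * T5)


# Hypothetical values of T1..T5, not derived from any data set
T = (2.0, 1.0, 0.5, 1.0, 0.25)
w0, w1, w2 = optimum_weights(*T)
best = quadratic_part(w1, w2, *T)
nearby = quadratic_part(w1 + 0.1, w2 + 0.1, *T)
```

Comparing `best` with `nearby` gives a quick numerical check that the closed-form weights do
sit at a minimum for the chosen inputs.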
4. Empirical Study


Data: (Source: Government of Pakistan (2004)) 


The population consists of rice cultivation areas in 73 districts of Pakistan. The variables
are defined as:


Y = rice production (in '000 tonnes, with one tonne = 0.984 ton) during 2003,


$P_1$ = proportion of farms where rice production is more than 20 tonnes during the year 2002, and
$P_2$ = proportion of farms with rice cultivation area more than 20 ha during the year 2003.
For this data, we have

$N = 73$, $\bar{Y} = 61.3$, $P_1 = 0.4247$, $P_2 = 0.3425$, $S_y^2 = 12371.4$,
$S_{\phi_1}^2 = 0.225490$, $S_{\phi_2}^2 = 0.228311$,
$\rho_{pb_1} = 0.621$, $\rho_{pb_2} = 0.673$, $\rho_\phi = 0.889$.
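The percent relative efficiency (PRE) of a simple ratio-type estimator can be reproduced
from these population values. The sketch below assumes the estimator $\bar{y}P_2/p_2$ with
first-order MSE proportional to $C_y^2 + C_{p_2}^2(1 - 2K_{pb_2})$, where
$C_y = S_y/\bar{Y}$, $C_{p_2} = S_{\phi_2}/P_2$ and $K_{pb_2} = \rho_{pb_2}C_y/C_{p_2}$.
These definitions are standard but are stated here as assumptions, and the variable names
are illustrative only.

```python
import math

# Population values quoted above for the rice data
Y_bar = 61.3
P2 = 0.3425
S2_y = 12371.4
S2_phi2 = 0.228311
rho_pb2 = 0.673

# Coefficients of variation (assumed definitions: standard deviation / mean)
C_y = math.sqrt(S2_y) / Y_bar
C_p2 = math.sqrt(S2_phi2) / P2
K_pb2 = rho_pb2 * C_y / C_p2

# First-order MSE of y_bar is f1 * Y_bar^2 * C_y^2; for the ratio-type
# estimator it is f1 * Y_bar^2 * (C_y^2 + C_p2^2 * (1 - 2 * K_pb2)).
# The common factor f1 * Y_bar^2 cancels in the ratio, so n is not needed.
pre_ratio = 100 * C_y ** 2 / (C_y ** 2 + C_p2 ** 2 * (1 - 2 * K_pb2))
print(round(pre_ratio, 2))
```

With these inputs the sketch gives a PRE of roughly 179.8, which can be compared with the
value 179.77 reported in Table 4.1.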


Table 4.1: PRE of different estimators of $\bar{Y}$ with respect to $\bar{y}$.


Choice of scalars, when w_0 = 0, w_1 = 1, w_2 = 0

alpha_1  alpha_2  L_1     L_2       L_3     L_4       PRE
 0        1                         1       0         179.77
 1        0       1       0                           162.68
 1        1       1       1         1       1         156.28
-1        1       1       0         1       0         112.97
 1        1       C_p1    rho_pb1   C_p2    rho_pb2   178.10
 1        1       NP_1    K_pb1     NP_2    K_pb2     110.95
-1        1       NP_1    f         NP_2    f         112.78
-1        1       N       K_pb1     N       K_pb2     112.68
 1        1       NP_1    P_1       NP_2    P_2       119.32
 1        1       n       P_1       n       P_2       115.32
-1        1       N       rho_pb1   N       rho_pb2   112.38
-1        1       n       P_1       n       P_2       113.00
-1        1       N       P_1       N       P_2       112.94

When w_0 = 0, w_1 = 0, w_2 = 1

beta_1   beta_2   L_5     L_6       L_7     L_8       PRE
 1        0       1       0         1       0         141.81
 0        1       1       0         1       0          60.05
 1       -1       1       0         1       0         180.50
 1       -1       1       1         1       1         127.39
 1       -1       1       1         1       0         170.59
 1       -1       C_p1    rho_pb1   C_p2    rho_pb2   143.83
 1       -1       NP_1    K_pb1     NP_2    K_pb2     179.95
 1       -1       NP_1    f         NP_2    f         180.52
 1       -1       N       K_pb1     N       K_pb2     180.56
 1       -1       NP_1    P_1       NP_2    P_2       180.53
 1       -1       n       P_1       n       P_2       179.49
 1       -1       N       rho_pb1   N       rho_pb2   180.55
 1       -1       n       P_1       n       P_2       180.36
 1       -1       N       P_1       N       P_2       180.57

When $w_1$ and $w_2$ take the optimum values given in (3.6), with $L_i = 1$
$(i = 1, 2, \ldots, 8)$ and $\alpha_1 = \alpha_2 = \beta_1 = \beta_2 = 1$: PRE($t_p$) = 183.60


5. Double Sampling 

It is assumed that the population proportion $P_1$ for the first auxiliary attribute $\phi_1$
is unknown but the same is known for the second auxiliary attribute $\phi_2$. When $P_1$ is
unknown, it is sometimes estimated from a preliminary large sample of size $n'$ on which only
the attribute $\phi_1$ is measured. Then a second-phase sample of size $n$ $(n < n')$ is drawn
and Y is observed.

Let $p'_j = \frac{1}{n'}\sum_{i=1}^{n'}\phi_{ji}$ $(j = 1, 2)$.


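The estimators $t_1$, $t_2$, $t_3$ and $t_4$ in two-phase sampling take the following form

$$ t_{d1} = \bar{y}\left(\frac{p'_1}{p_1}\right) \tag{5.1} $$

$$ t_{d2} = \bar{y}\left(\frac{P_2}{p'_2}\right) \tag{5.2} $$

$$ t_{d3} = \bar{y}\exp\left(\frac{p'_1 - p_1}{p'_1 + p_1}\right) \tag{5.3} $$

$$ t_{d4} = \bar{y}\exp\left(\frac{p'_2 - P_2}{p'_2 + P_2}\right) \tag{5.4} $$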


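The bias and MSE expressions of the estimators $t_{d1}$, $t_{d2}$, $t_{d3}$ and $t_{d4}$ up to
the first order of approximation are, respectively, given as

$$ B(t_{d1}) = \bar{Y}f_3C_{p_1}^2\left[1 - K_{pb_1}\right] \tag{5.5} $$

$$ B(t_{d2}) = \bar{Y}f_2C_{p_2}^2\left[1 - K_{pb_2}\right] \tag{5.6} $$

$$ B(t_{d3}) = \bar{Y}f_3\frac{C_{p_1}^2}{2}\left[\frac{3}{4} - K_{pb_1}\right] \tag{5.7} $$

$$ B(t_{d4}) = \bar{Y}f_2\frac{C_{p_2}^2}{2}\left[K_{pb_2} - \frac{1}{4}\right] \tag{5.8} $$

$$ MSE(t_{d1}) = \bar{Y}^2\left[f_1C_y^2 + f_3C_{p_1}^2\left(1 - 2K_{pb_1}\right)\right] \tag{5.9} $$

$$ MSE(t_{d2}) = \bar{Y}^2\left[f_1C_y^2 + f_2C_{p_2}^2\left(1 - 2K_{pb_2}\right)\right] \tag{5.10} $$

$$ MSE(t_{d3}) = \bar{Y}^2\left[f_1C_y^2 + f_3\frac{C_{p_1}^2}{4}\left(1 - 4K_{pb_1}\right)\right] \tag{5.11} $$

$$ MSE(t_{d4}) = \bar{Y}^2\left[f_1C_y^2 + f_2\frac{C_{p_2}^2}{4}\left(1 + 4K_{pb_2}\right)\right] \tag{5.12} $$

where,

$$ S_{\phi_j}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\phi_{ji} - P_j)^2, \qquad s'^2_{\phi_j} = \frac{1}{n'-1}\sum_{i=1}^{n'}(\phi_{ji} - p'_j)^2, $$

$$ f_1 = \frac{1}{n} - \frac{1}{N}, \qquad f_2 = \frac{1}{n'} - \frac{1}{N}, \qquad f_3 = \frac{1}{n} - \frac{1}{n'}. $$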
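To see how the two-phase design enters these expressions, the sketch below evaluates the
sampling fractions and the PRE of the two-phase ratio-type estimator $\bar{y}p'_1/p_1$
relative to $\bar{y}$. It assumes $f_1 = 1/n - 1/N$, $f_2 = 1/n' - 1/N$, $f_3 = 1/n - 1/n'$ and
the first-order MSE $\bar{Y}^2[f_1C_y^2 + f_3C_{p_1}^2(1 - 2K_{pb_1})]$ with
$K_{pb_1} = \rho_{pb_1}C_y/C_{p_1}$. The function names and all numerical inputs are
hypothetical, since the sample sizes are not stated in this section.

```python
def sampling_fractions(N, n_first, n):
    """Return (f1, f2, f3) for population size N, first-phase sample
    size n_first (n' in the text) and second-phase sample size n."""
    if not (0 < n < n_first <= N):
        raise ValueError("expected 0 < n < n' <= N")
    f1 = 1 / n - 1 / N
    f2 = 1 / n_first - 1 / N
    f3 = 1 / n - 1 / n_first
    return f1, f2, f3


def pre_two_phase_ratio(N, n_first, n, C_y, C_p1, rho_pb1):
    """PRE of y_bar * p1'/p1 relative to y_bar, to first order."""
    f1, _, f3 = sampling_fractions(N, n_first, n)
    K_pb1 = rho_pb1 * C_y / C_p1
    mse_mean = f1 * C_y ** 2                       # Y_bar^2 cancels in the ratio
    mse_ratio = f1 * C_y ** 2 + f3 * C_p1 ** 2 * (1 - 2 * K_pb1)
    return 100 * mse_mean / mse_ratio


# Hypothetical design and population quantities
f1, f2, f3 = sampling_fractions(N=100, n_first=40, n=10)
pre = pre_two_phase_ratio(N=100, n_first=40, n=10, C_y=1.0, C_p1=1.0, rho_pb1=0.8)
```

Because $f_3 = f_1 - f_2$, the gain from the auxiliary attribute shrinks as the first-phase
sample size $n'$ approaches $n$.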
The estimators $t_5$ and $t_6$ in two-phase sampling take the following form

$$ t_{d5} = \bar{y}\left(\frac{p'_1}{p_1}\right)^{m_1}\left(\frac{P_2}{p'_2}\right)^{m_2} \tag{5.13} $$

$$ t_{d6} = \bar{y}\exp\left(\frac{p'_1 - p_1}{p'_1 + p_1}\right)^{n_1}\exp\left(\frac{p'_2 - P_2}{p'_2 + P_2}\right)^{n_2} \tag{5.14} $$

where $m_1$, $m_2$, $n_1$ and $n_2$ are real constants.


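The bias and MSE expressions of the estimators $t_{d5}$ and $t_{d6}$ up to the first order of
approximation are, respectively, given by

$$ B(t_{d5}) = \bar{Y}\left[f_3C_{p_1}^2\left(\frac{m_1^2}{2} + \frac{m_1}{2} - m_1K_{pb_1}\right) + f_2C_{p_2}^2\left(\frac{m_2^2}{2} + \frac{m_2}{2} - m_2K_{pb_2}\right)\right] \tag{5.15} $$

$$ B(t_{d6}) = \bar{Y}\left[f_3C_{p_1}^2\left(\frac{n_1^2}{8} + \frac{n_1}{4} - \frac{n_1}{2}K_{pb_1}\right) + f_2C_{p_2}^2\left(\frac{n_2^2}{8} - \frac{n_2}{4} + \frac{n_2}{2}K_{pb_2}\right)\right] \tag{5.16} $$

$$ MSE(t_{d5}) = \bar{Y}^2\left[f_1C_y^2 + f_3C_{p_1}^2\left(m_1^2 - 2m_1K_{pb_1}\right) + f_2C_{p_2}^2\left(m_2^2 - 2m_2K_{pb_2}\right)\right] \tag{5.17} $$

$$ MSE(t_{d6}) = \bar{Y}^2\left[f_1C_y^2 + f_3\left(\frac{n_1^2}{4} - n_1K_{pb_1}\right)C_{p_1}^2 + f_2\left(\frac{n_2^2}{4} + n_2K_{pb_2}\right)C_{p_2}^2\right] \tag{5.18} $$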


6. Estimator $t_{pd}$ in Two-Phase Sampling


Using a linear combination of $t_{di}$ $(i = 0, 1, 2)$, we define an estimator of the form

$$ t_{pd} = \sum_{i=0}^{2} h_it_{di} \in H \tag{6.1} $$

such that

$$ \sum_{i=0}^{2} h_i = 1 \quad \text{and} \quad h_i \in R, \tag{6.2} $$

where

$$ t_{d0} = \bar{y}, $$

$$ t_{d1} = \bar{y}\left[\frac{L_1p'_1 + L_2}{L_1p_1 + L_2}\right]^{m_1}\left[\frac{L_3P_2 + L_4}{L_3p'_2 + L_4}\right]^{m_2} $$

and

$$ t_{d2} = \bar{y}\exp\left[\frac{(L_5p'_1 + L_6) - (L_5p_1 + L_6)}{(L_5p'_1 + L_6) + (L_5p_1 + L_6)}\right]^{n_1}\exp\left[\frac{(L_7p'_2 + L_8) - (L_7P_2 + L_8)}{(L_7p'_2 + L_8) + (L_7P_2 + L_8)}\right]^{n_2}, $$

where $h_i$ $(i = 0, 1, 2)$ denotes the constants used for reducing the bias in the class of
estimators, H denotes the set of those estimators that can be constructed from $t_{di}$
$(i = 0, 1, 2)$ and R denotes the set of real numbers (for detail see Singh et al. (2008)).
Also, $L_i$ $(i = 1, 2, \ldots, 8)$ are either real numbers or functions of the known
parameters of the auxiliary attributes.


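Expressing $t_{pd}$ in terms of e's, we have

$$ t_{pd} = \bar{Y}(1 + e_0)\left[h_0 + h_1(1 + \theta_1e'_1)^{m_1}(1 + \theta_1e_1)^{-m_1}(1 + \theta_2e'_2)^{-m_2} + h_2\exp\left(\theta_3(e'_1 - e_1)[1 + \theta_3(e'_1 + e_1)]^{-1}\right)^{n_1}\exp\left(\theta_4e'_2[1 + \theta_4e'_2]^{-1}\right)^{n_2}\right] \tag{6.3} $$

After expanding, subtracting $\bar{Y}$ from both sides of equation (6.3) and neglecting
terms of power greater than two, we have

$$ (t_{pd} - \bar{Y}) = \bar{Y}\left[e_0 + h_1(m_1\theta_1e'_1 - m_1\theta_1e_1 - m_2\theta_2e'_2) + h_2(n_1\theta_3e'_1 - n_1\theta_3e_1 + n_2\theta_4e'_2)\right] \tag{6.4} $$

Squaring both sides of (6.4) and then taking expectations, we get the MSE of the estimator
$t_{pd}$ up to the first order of approximation as

$$ MSE(t_{pd}) = \bar{Y}^2\left[f_1C_y^2 + h_1^2R_1 + h_2^2R_2 + 2h_1h_2R_3 + 2h_1R_4 + 2h_2R_5\right] \tag{6.5} $$

where the values of $h_1$ and $h_2$ that minimize (6.5) are

$$ h_1 = \frac{R_3R_5 - R_2R_4}{R_1R_2 - R_3^2}, \qquad h_2 = \frac{R_3R_4 - R_1R_5}{R_1R_2 - R_3^2} \tag{6.6} $$

and

$$ R_1 = \theta_1^2m_1^2f_3C_{p_1}^2 + \theta_2^2m_2^2f_2C_{p_2}^2 $$

$$ R_2 = \theta_3^2n_1^2f_3C_{p_1}^2 + \theta_4^2n_2^2f_2C_{p_2}^2 $$

$$ R_3 = m_1n_1f_3\theta_1\theta_3C_{p_1}^2 - n_2m_2\theta_2\theta_4f_2C_{p_2}^2 $$

$$ R_4 = -m_1\theta_1f_3K_{pb_1}C_{p_1}^2 - m_2\theta_2f_2K_{pb_2}C_{p_2}^2 $$

$$ R_5 = -n_1\theta_3f_3K_{pb_1}C_{p_1}^2 + n_2\theta_4f_2K_{pb_2}C_{p_2}^2 \tag{6.7} $$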

Data: (Source: Singh and Chaudhary (1986), p. 177). 


The population consists of 34 wheat farms in 34 villages in a certain region of India. The
variables are defined as:


y = area under wheat crop (in acres) during 1974,


$P_1$ = proportion of farms under wheat crop which have more than 500 acres land during 1971,
and
$P_2$ = proportion of farms under wheat crop which have more than 100 acres land during 1973.
For this data, we have

$N = 34$, $\bar{Y} = 199.4$, $P_1 = 0.6765$, $P_2 = 0.7353$, $S_y^2 = 22564.6$,
$S_{\phi_1}^2 = 0.225490$, $S_{\phi_2}^2 = 0.200535$,
$\rho_{pb_1} = 0.599$, $\rho_{pb_2} = 0.559$, $\rho_\phi = 0.725$.


Table 6.1: PRE of different estimators of $\bar{Y}$ with respect to $\bar{y}$.


Choice of scalars, when h_0 = 0, h_1 = 1, h_2 = 0

m_1      m_2      L_1     L_2       L_3     L_4       PRE
 0        1                         1       0         108.16
 1        0       1       0                           121.59
 1        1       1       1         1       1         142.19
 1        1       1       0         1       0         133.40
 1        1       C_p1    rho_pb1   C_p2    rho_pb2   144.78
 1        1       NP_1    K_pb1     NP_2    K_pb2     136.90
 1        1       NP_1    f         NP_2    f         133.30
 1        1       N       K_pb1     N       K_pb2     135.73
 1        1       NP_1    P_1       NP_2    P_2       137.09
 1        1       n       P_1       n       P_2       138.23
 1        1       N       rho_pb1   N       rho_pb2   135.49
 1        1       n       P_1       n       P_2       138.97
 1        1       N       P_1       N       P_2       135.86

When h_0 = 0, h_1 = 0, h_2 = 1

n_1      n_2      L_5     L_6       L_7     L_8       PRE
 1        0       1       0         1       0         130.89
 0       -1       1       0         1       0         108.93
 1       -1       1       0         1       0         146.63
 1       -1       1       1         1       1         121.68
 1       -1       1       1         1       0         127.24
 1       -1       C_p1    rho_pb1   C_p2    rho_pb2   123.43
 1       -1       NP_1    K_pb1     NP_2    K_pb2     145.49
 1       -1       NP_1    f         NP_2    f         146.57
 1       -1       N       K_pb1     N       K_pb2     145.84
 1       -1       NP_1    P_1       NP_2    P_2       145.43
 1       -1       n       P_1       n       P_2       145.03
 1       -1       N       rho_pb1   N       rho_pb2   145.92
 1       -1       n       P_1       n       P_2       144.85
 1       -1       N       P_1       N       P_2       145.80

When $h_1$ and $h_2$ take the optimum values given in (6.6), with $L_i = 1$
$(i = 1, 2, \ldots, 8)$ and $m_1 = m_2 = n_1 = n_2 = 1$: PRE($t_{pd}$) = 154.28


Rajesh Singh and Florentin Smarandache (editors)


7. Conclusion 


In this paper, we have suggested a class of estimators in single and two-phase
sampling by using the point bi-serial correlation and phi-correlation coefficients. From
Table 4.1 and Table 6.1, we observe that the proposed estimators $t_p$ and $t_{pd}$ perform
better than the other estimators considered in this paper.